Search results for " feature selection"
showing 10 items of 10 documents
Prototyping Crop Traits Retrieval Models for CHIME: Dimensionality Reduction Strategies Applied to PRISMA Data
2022
In preparation for new-generation imaging spectrometer missions and the accompanying unprecedented inflow of hyperspectral data, optimized models are needed to generate vegetation traits routinely. Hybrid models, combining radiative transfer models with machine learning algorithms, are preferred, however, dealing with spectral collinearity imposes an additional challenge. In this study, we analyzed two spectral dimensionality reduction methods: principal component analysis (PCA) and band ranking (BR), embedded in a hybrid workflow for the retrieval of specific leaf area (SLA), leaf area index (LAI), canopy water content (CWC), canopy chlorophyll content (CCC), the fraction of absorbed photo…
A new feature selection strategy for K-mers sequence representation
2014
DNA sequence decomposition into k-mers (substrings of length k) and their frequency counting, defines a mapping of a sequence into a numerical space by a numerical feature vector of fixed length. This simple process allows to compute sequence comparison in an alignment free way, using common similarities and distance functions on the numerical codomain of the mapping. The most common used decomposition uses all the substrings of length k making the codomain of exponential dimension. This obviously can affect the time complexity of the similarity computation, and in general of the machine learning algorithm used for the purpose of sequence classification. Moreover, the presence of possible n…
Design and Prototyping of a Smart University Campus
2019
The authors propose a framework to support the “smart planning” of a university environment, intended as a “smart campus.” The main goal is to improve the management, storage, and mining of information coming from the university areas and main players. The platform allows for interaction with the main players of the system, generating and displaying useful data in real time for a better user experience. The proposed framework provides also a chat assistant able to respond to user requests in real time. This will not only improve the communication between university environment and students, but it allows one to investigate on their habits and needs. Moreover, information collected from the …
Local Feature Selection with Dynamic Integration of Classifiers
2000
Multidimensional data is often feature space heterogeneous so that individual features have unequal importance in different sub areas of the feature space. This motivates to search for a technique that provides a strategic splitting of the instance space being able to identify the best subset of features for each instance to be classified. Our technique applies the wrapper approach where a classification algorithm is used as an evaluation function to differentiate between different feature subsets. In order to make the feature selection local, we apply the recent technique for dynamic integration of classifiers. This allows to determine which classifier and which feature subset should be us…
A Comparative Study on Feature Selection for Retinal Vessel Segmentation Using FABC
2009
This paper presents a comparative study on five feature selection heuristics applied to a retinal image database called DRIVE. Features are chosen from a feature vector (encoding local information, but as well information from structures and shapes available in the image) constructed for each pixel in the field of view (FOV) of the image. After selecting the most discriminatory features, an AdaBoost classifier is applied for training. The results of classifications are used to compare the effectiveness of the five feature selection methods.
Variability of Classification Results in Data with High Dimensionality and Small Sample Size
2021
The study focuses on the analysis of biological data containing information on the number of genome sequences of intestinal microbiome bacteria before and after antibiotic use. The data have high dimensionality (bacterial taxa) and a small number of records, which is typical of bioinformatics data. Classification models induced on data sets like this usually are not stable and the accuracy metrics have high variance. The aim of the study is to create a preprocessing workflow and a classification model that can perform the most accurate classification of the microbiome into groups before and after the use of antibiotics and lessen the variability of accuracy measures of the classifier. To ev…
A New Feature Selection Methodology for K-mers Representation of DNA Sequences
2015
DNA sequence decomposition into k-mers and their frequency counting, defines a mapping of a sequence into a numerical space by a numerical feature vector of fixed length. This simple process allows to compare sequences in an alignment free way, using common similarities and distance functions on the numerical codomain of the mapping. The most common used decomposition uses all the substrings of a fixed length k making the codomain of exponential dimension. This obviously can affect the time complexity of the similarity computation, and in general of the machine learning algorithm used for the purpose of sequence analysis. Moreover, the presence of possible noisy features can also affect the…
Input Selection Methods for Soft Sensor Design: A Survey
2020
Soft Sensors (SSs) are inferential models used in many industrial fields. They allow for real-time estimation of hard-to-measure variables as a function of available data obtained from online sensors. SSs are generally built using industries historical databases through data-driven approaches. A critical issue in SS design concerns the selection of input variables, among those available in a candidate dataset. In the case of industrial processes, candidate inputs can reach great numbers, making the design computationally demanding and leading to poorly performing models. An input selection procedure is then necessary. Most used input selection approaches for SS design are addressed in this …
3D DCE-MRI Radiomic Analysis for Malignant Lesion Prediction in Breast Cancer Patients
2022
Rationale and Objectives: To develop and validate a radiomic model, with radiomic features extracted from breast Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) from a 1.5T scanner, for predicting the malignancy of masses with enhancement. Images were acquired using an 8-channel breast coil in the axial plane. The rationale behind this study is to show the feasibility of a radio-mics-powered model that could be integrated into the clinical practice by exploiting only standard-of-care DCE-MRI with the goal of reducing the required image pre-processing (ie, normalization and quantitative imaging map generation).Materials and Methods: 107 radiomic features were extracted from a …
Variable Ranking Feature Selection for the Identification of Nucleosome Related Sequences
2018
Several recent works have shown that K-mer sequence representation of a DNA sequence can be used for classification or identification of nucleosome positioning related sequences. This representation can be computationally expensive when k grows, making the complexity in spaces of exponential dimension. This issue effects significantly the classification task computed by a general machine learning algorithm used for the purpose of sequence classification. In this paper, we investigate the advantage offered by the so-called Variable Ranking Feature Selection method to select the most informative k − mers associated to a set of DNA sequences, for the final purpose of nucleosome/linker classifi…